stdlib: change the console to UTF-8 on start #40632
This switches the Windows console codepage to UTF-8 on startup. This is important because the default codepage (CP437) expects ASCII and does not allow for UTF-8 output, whereas strings in Swift are assumed to be UTF-8, so there is a conversion mismatch.

Because the console mode is state local to the console and not the C runtime, it persists beyond the lifetime of the application, so we should restore the state of the console before termination. We do this by registering a termination handler via `atexit`. This means that an abnormal termination (e.g. via `fatalError`) skips the handler and will irrevocably alter the state of the console (interestingly enough, `chcp` will still report the original console codepage even though the console will internally be set to UTF-8).

Fixes: SR-13807
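For readers who want to see the shape of the save, switch, and restore pattern described above, here is a minimal sketch in C. It is not the code from this PR. The names `use_utf8_console`, `restore_console_cp`, and `saved_output_cp` are hypothetical, and the non-Windows branch is a stand-in I am assuming purely so the sketch compiles off Windows. Only `GetConsoleOutputCP`, `SetConsoleOutputCP`, `CP_UTF8`, and `atexit` are real APIs.

```c
#include <assert.h>
#include <stdlib.h>

#if defined(_WIN32)
#include <windows.h>
#else
/* ASSUMPTION: stand-ins that model a console remembering its output
   codepage, so the sketch builds and runs on non-Windows hosts. */
typedef unsigned int UINT;
#define CP_UTF8 65001u
static UINT fake_console_cp = 437u;
static UINT GetConsoleOutputCP(void) { return fake_console_cp; }
static int SetConsoleOutputCP(UINT cp) { fake_console_cp = cp; return 1; }
#endif

/* Hypothetical name: codepage to put back at exit; 0 means nothing saved. */
static UINT saved_output_cp = 0;

/* Hypothetical termination handler: restores the codepage saved below. */
static void restore_console_cp(void) {
  if (saved_output_cp != 0)
    SetConsoleOutputCP(saved_output_cp);
}

/* Hypothetical entry point: returns 1 if the console output codepage is
   UTF-8 after the call, 0 otherwise. */
static int use_utf8_console(void) {
  UINT current = GetConsoleOutputCP();
  if (current == 0)
    return 0; /* the query failed, e.g. no console is attached */
  if (current == CP_UTF8)
    return 1; /* already UTF-8, nothing to change or restore */

  /* Register the handler before changing anything, so a successful switch
     always has a restore queued for normal termination. */
  saved_output_cp = current;
  if (atexit(restore_console_cp) != 0)
    return 0;

  return SetConsoleOutputCP(CP_UTF8) != 0;
}
```

Calling `use_utf8_console()` once at startup is the intended use. As the description notes, handlers registered with `atexit` only run on normal termination, so a trap leaves the console in UTF-8.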
|
@swift-ci please test |
|
@compnerd Do you know where that additional space in front of the emoji comes from? |
|
I suspect the font. It appears in the editor as well, though there is no explicit space there. Most likely it is adding padding for the combination. |
There is no space there that is being emitted. That is a rendering artifact. It may also be a rendering bug in Windows Terminal: it might treat the combining character as a separate glyph and offset by a cell when rendering. |
|
I bet there should be an input ( |
|
Bad news on that front:
|
|
Right, I don't think that the input code page would be helpful, which is why I didn't add that in; you would want Edit: we might be able to do something by using |


